This document details the analyses and provides results for the primary analyses of the VIBRANT trial.
Trial data
The data used for these analyses are stored in an R Bioconductor MultiAssayExperiment object which contains the QCed and pre-processed data. The code and description of how raw data have been processed and QCed is available on the VIBRANT-0-QC-and-preprocessing GitHub repository (Still private at the moment, but will be made public upon publication, and anyone wishing to have access can email Laura Symul).
Loading the RDS file containing the unblinded VIBRANT MAE: /04 unblinded MAEs/mae_full_20250715.rds
Code
mae <-readRDS(mae_file)rm(mae_file)fs::dir_ls(str_c(data_dir(), "01 Preprocessed and QCed/"), regexp ="exposures_.*\\.Rdata$") |>sort(decreasing =TRUE) |> magrittr::extract(1) |>load()
Demographics (Table 1 & suppl. tables by sites)
Summary table
The demographics table is built as follow.
We first create a table1_datadata.frame from the participant_crfs_merged table stored in the @metadata slot of the mae.
This table contains almost all variables needed to create the demographics table (i.e., site, age, race, ethn, cut_size_meals, eat_less, hungry_did_not_eat, sexual_partners_lifetime, sexual_partners_past_month, contraception_method).
In addition, the mae@colData contains the participants’ arm and study population flags (here we keep the mITT population).
From the variables listed above, we create the variable food_insecurity using the three variables cut_size_meals, eat_less and hungry_did_not_eat: if a participant answered “Yes” to any of the corresponding questions, then she is considered to be affected by food insecurity.
table1_data |>ggplot(aes(x = food_insecurity)) +geom_bar() +labs(x="Food insecurity", y ="Count")+theme_classic()
In addition to these variables, we also include the participants’ partner’s gender.
Participants are asked about their partner’s gender at all weekly visits (1000, 1100, 1200, 1300, 1400, 1500, 1700, 1900, 2120) using CRF 19.
Answers to that question can be found in the past_week_sex_partner column of the mae@metadata$visits_crfs_merged table. Participants’ answer throughout the weekly visits will be summarized into one of these 4 categories:
No sex
Only male
Only female
Both
This variable is added to the table1_data table.
Code
gender_sexual_partner <-metadata(mae)$visits_crfs_merged |>select(pid, past_week_sex_partner) |>filter(!is.na(past_week_sex_partner)) |>group_by(pid) |>summarise(gender_sexual_partner =case_when(all(past_week_sex_partner =="No sex in the past week") ~"No sex",any(past_week_sex_partner %in%c("A single male partner", "Multiple male partners")) &!any(past_week_sex_partner %in%c("A single female partner", "Multiple female partners")) ~"Only man",any(past_week_sex_partner %in%c("A single female partner", "Multiple female partners")) &!any(past_week_sex_partner %in%c("A single male partner", "Multiple male partners")) ~"Only female",any(past_week_sex_partner %in%c("A single male partner", "Multiple male partners")) &any(past_week_sex_partner %in%c("A single female partner", "Multiple female partners")) ~"Both",TRUE~NA_character_ ),.groups ="drop" )table1_data <- table1_data |>left_join(gender_sexual_partner, by ="pid")
We also re-code the race variable as follows:
“Asian or South Asian” is re-coded to “Asian” (for brevity)
“Black, African American, or African” and “African” are re-coded to “Black/African” (for brevity)
“Coloured”, “White”, “Other”, and “Prefered not to answer” are left as is.
Note
@Disebo & Caroline: do we go for British or American English? (Colored vs Coloured)?
Code
table1_data <- table1_data |>mutate(across(where(is.character), as.factor),race =fct_recode( race,"Asian"="Asian or South Asian","Black/African"="Black, African American, or African","Black/African"="African","Coloured"="Coloured","White"="White","Other"="Other","Prefer not to answer"="Prefer not to answer" ),race =fct_relevel(race, "Asian", "Black/African", "Coloured", "White", "Other", "Prefer not to answer"),race = race |>fct_drop(),arm = arm |>fct_drop() ) |>set_variable_labels(food_insecurity ="Report food insecurity in past 12 months ",sexual_partners_past_month ="Number of partners in past month",sexual_partners_lifetime ="Number of partners in lifetime",ethn ="Ethnicity", contraception_method ="Contraception", gender_sexual_partner ="Gender of sexual partners" )
Warning
@Laura Vermeren: I changed the summary statistics from mean to median for the continuous variables (for number of past sexual partner, the mean was artificially high in the arm with the 90 partner MGH participant). Would you be able to check that the tests selected by the gt_summary package are appropriate for the data we have? They should be, but I’d love if we could double-check :) >LV : I check and I think it is good :)
2 Kruskal-Wallis rank sum test; Fisher’s exact test
Study population characteristics appear balanced across arms at the CAPRISA site. Given the low number of participants at the MGH site in two of the five arms, we create an additional table excluding these two arms.
2 Kruskal-Wallis rank sum test; Fisher’s exact test
Besides the number of lifetime sexual partners, arms appear balanced across arm at MGH.
Primary outcomes (Table 2)
LBP colonization by week 5
The primary outcome is defined as “Colonization by any of the LBP strains after product administration by week 5”, and colonization by any of the LBP strain is defined as a relative abundance of 5% by any single strain or 10% total relative abundance of the LBP strains. The relative abundance of LBP strain is estimated using metagenomics (taxonomic composition estimated using VIRGO2 and strain proportion of total L. crispatus is estimated using kSanity), and includes samples from weekly visits 1200 to 1500 (week 2 (post-INT) to week 5 (3 weeks post-INT)).
test_results <- fisher_test_results |>mutate(adj_p_value = adj_p_value |> scales::pvalue(accuracy =0.001),OR = OR |>round(2) |>as.character(),CI =str_c(lower_CI |>round(2), " - ", upper_CI |>round(2)) ) |>select(arm, adj_p_value, OR, CI) |>pivot_longer(cols =-arm, names_to ="name", values_to ="value") |>pivot_wider(names_from = arm, values_from = value) |>mutate(name =case_when( name =="adj_p_value"~"Adj. p-value", name =="OR"~"OR", name =="CI"~"95% CI",TRUE~ name ) )
Code
table_2_with_model_results <-bind_rows( table_2, test_results |>filter(name !="Adj. p-value") ) |>mutate(site = site |>str_replace_na(""),Pl =case_when( name =="Adj. p-value"~"Ref.", name =="OR"~"Ref.",str_detect(name, "CI") ~"",TRUE~ Pl ) )
Code
table_2_with_model_results |>group_by(site) |>gt(caption ="LBP Colonization by week 5 by arm, with colonization definition based on qPCR-estimated relative abundance.", row_group_as_column =TRUE ) |>cols_width("name"~px(150),everything() ~px(120) ) |>cols_label(name ="",Pl ="Placebo",`LC-106-7`="LC-106<br>7 days",`LC-106-3`="LC-106<br>3 days",`LC-106-o`="LC-106<br>overlap",`LC-115`="LC-115<br>7 days",.fn = md )
LBP Colonization by week 5 by arm, with colonization definition based on qPCR-estimated relative abundance.
Placebo
LC-106 7 days
LC-106 3 days
LC-106 overlap
LC-115 7 days
CAP
N participants
N = 14
N = 14
N = 14
N = 14
N = 14
n (%) participants with LBP strain detected
2 (14 %)
11 (79 %)
8 (57 %)
8 (57 %)
10 (71 %)
95% CI
49% - 95%
29% - 82%
29% - 82%
42% - 92%
MGH
N participants
N = 5
N = 7
N = 1
N = 1
N = 6
n (%) participants with LBP strain detected
1 (20 %)
6 (86 %)
1 (100 %)
1 (100 %)
6 (100 %)
95% CI
42% - 100%
3% - 100%
3% - 100%
54% - 100%
OR
Ref.
20.25
7.44
7.44
19.07
95% CI
4.44 - Inf
1.62 - Inf
1.62 - Inf
4.15 - Inf
LBP detection by week 5, qPCR-based detection
In this analysis, LBP is considered to be detected if at least 2 LBP strains had observed Cq values at the same visit by week 5. “By week 5” includes all post-product visits from visit_code = 1200 (no excluding the immediate post-product visits.
detected_by_week5_qPCR |>arrange(site, arm, LBP_detected_by_week5) |>group_by(site, arm) |>mutate(participant_number =row_number() |>factor()) |>ungroup() |>ggplot(aes(x = participant_number, y = site |>fct_rev(), col = site, fill = site, shape = LBP_detected_by_week5)) +geom_point(size =3) +facet_grid(. ~ arm, scales ="free_x", space ="free_x") +guides(color ="none", fill ="none") +scale_shape_manual("Colonized by LBP by week 5", values =c(1, 16)) +scale_color_manual(values = site_colors) +xlab("Participant number") +ylab("") +ggtitle("Colonization based on qPCR-based detection of at least 2 LBP strains detected at the same visit by week 5") +theme(legend.position ="top",axis.text.x =element_text(angle =90, hjust =1, vjust =0.5), plot.title =element_text(hjust =0.5) )
test_results <- fisher_test_results |>mutate(adj_p_value = adj_p_value |> scales::pvalue(accuracy =0.001),OR = OR |>round(2) |>as.character(),CI =str_c(lower_CI |>round(2), " - ", upper_CI |>round(2)) ) |>select(arm, adj_p_value, OR, CI) |>pivot_longer(cols =-arm, names_to ="name", values_to ="value") |>pivot_wider(names_from = arm, values_from = value) |>mutate(name =case_when( name =="adj_p_value"~"Adj. p-value", name =="OR"~"OR", name =="CI"~"95% CI",TRUE~ name ) )
Code
table_2_with_model_results <-bind_rows( table_2, test_results |>filter(name !="Adj. p-value") ) |>mutate(site = site |>str_replace_na(""),Pl =case_when( name =="Adj. p-value"~"Ref.", name =="OR"~"Ref.",str_detect(name, "CI") ~"",TRUE~ Pl ) )
Code
table_2_with_model_results |>group_by(site) |>gt(caption ="LBP detection by week 5 by arm, with LBP detection defined as at least two LBP strains simultaneously detected by qPCR.", row_group_as_column =TRUE ) |>cols_width("name"~px(150),everything() ~px(120) ) |>cols_label(name ="",Pl ="Placebo",`LC-106-7`="LC-106<br>7 days",`LC-106-3`="LC-106<br>3 days",`LC-106-o`="LC-106<br>overlap",`LC-115`="LC-115<br>7 days",.fn = md )
LBP detection by week 5 by arm, with LBP detection defined as at least two LBP strains simultaneously detected by qPCR.
Placebo
LC-106 7 days
LC-106 3 days
LC-106 overlap
LC-115 7 days
CAP
N participants
N = 14
N = 14
N = 14
N = 14
N = 14
n (%) participants with LBP strain detected
2 (14 %)
12 (86 %)
8 (57 %)
9 (64 %)
11 (79 %)
95% CI
57% - 98%
29% - 82%
35% - 87%
49% - 95%
MGH
N participants
N = 5
N = 7
N = 1
N = 1
N = 6
n (%) participants with LBP strain detected
3 (60 %)
6 (86 %)
1 (100 %)
1 (100 %)
6 (100 %)
95% CI
42% - 100%
3% - 100%
3% - 100%
54% - 100%
OR
Ref.
15.28
4.01
5.29
14.44
95% CI
3.49 - Inf
0.99 - Inf
1.29 - Inf
3.28 - Inf
L. crispatus dominance by week 5
In this analysis, L. crispatus dominance is defined as a metagenomic-estimated relative abundance ≥ 50%. “By week 5” includes all post-product visits from visit_code = 1200 (no excluding the immediate post-product visits.
g_tot_prop_LBP_strains +geom_label(aes(label = pid_nb), size =2) +labs(caption ="Participants are labeled by their enrollment order within each arm and site." )
Proportions of participants who colonized in each arm and each visit
g_tot_prop_Lc +geom_label(aes(label = pid_nb), size =2) +labs(caption ="Participants are labeled by their enrollment order within each arm and site." )
Proportions of participants who colonized with 50% L. crispatus in each arm and each visit
Microbiota trajectories of mITT participants in the VIBRANT study. The relative abundances of the LBP strains and top taxa are shown. Participants are ordered by their total relative abundance of LBP strains, and the study weeks are shown in the rows. The alpha level (transparency) indicates whether the participant is part of the per-protocol (PP) or modified intention-to-treat (mITT) population.
Code
# checking the data for the placebo participant that has the LBP strains at MGHmae[["mg"]] |>as_tibble() |> dplyr::left_join(mae@colData |>as_tibble()) |>filter(pid =="068100062", study_week ==5) mae[["qPCR"]] |>as_tibble() |> dplyr::left_join(mae@colData |>as_tibble()) |>filter(pid =="068100062", study_week ==5) mae[["amplicon"]] |>as_tibble() |> dplyr::left_join(mae@colData |>as_tibble()) |>filter(pid =="068100062", study_week ==5) |>select(.feature, .sample, rel_ab, exclude_sample)mae[["amplicon"]] |>as_tibble() |> dplyr::left_join(mae@colData |>as_tibble()) |>filter(pid =="068100062", .feature =="Lactobacillus_crispatus") |>select(.feature, .sample, visit_code, rel_ab, exclude_sample)
tmp <-# we use the metagenomics data for this mae[["mg"]] |>as_tibble() |>filter(!is.na(LBP), !poor_quality_mg_data) |>select(.feature, LBP, strain_origin, .sample, uid, rel_ab) |># add the arm, mITT, etc.left_join( mae |>colData() |>as_tibble() |>select(uid, pid, site, randomized, mITT, arm, visit_code, visit, PIPV), by =join_by(uid) ) |># only consider the mITTfilter(randomized, mITT) |>filter(PIPV, as.numeric(visit_code) >=1200) |># only include visits after the first 2 weeks# define the outcome of interest for each person and straingroup_by(.feature, LBP, strain_origin, pid, arm, site) |>summarize(ever_colonized =any(rel_ab >=0.05, na.rm =TRUE),.groups ="drop" ) |># exclude Placebo participants because, they were, in theory, not exposed to any of the strains.filter(arm !="Pl") |># define the total number of participants exposed for each strain.# For LC-115 strains, this is only participants of arm LC-115# For LC-106 strains, this is participants from all active arms.mutate(exposed =case_when( LBP =="LC-115"~ (arm =="LC-115"), LBP =="LC-106 & LC-115"~TRUE,TRUE~FALSE ) ) |># we compute the statistics of interestgroup_by(.feature, LBP, strain_origin, site) |>summarize(n_exposed =sum(exposed, na.rm =TRUE),n_colonized =sum(ever_colonized & exposed, na.rm =TRUE),.groups ="drop" ) |>mutate(prop_colonized = n_colonized / n_exposed,CI_low = binom::binom.confint(n_colonized, n_exposed, method ="wilson")$lower,CI_high = binom::binom.confint(n_colonized, n_exposed, method ="wilson")$upper )
Code
tmp |>ggplot() +aes(x = .feature, y = prop_colonized, col = site) +facet_grid(. ~ LBP + strain_origin, scales ="free", space ="free") +geom_errorbar(aes(ymin = CI_low, ymax = CI_high), linewidth =1, alpha =0.7,width =0.3, position =position_dodge(width =0.5) ) +geom_point(position =position_dodge(width =0.5), size =2) +scale_color_manual("Study site", values = site_colors) +xlab("LBP strain") +ylab("Proportion of participants colonized\namong those exposed") +theme(axis.text.x =element_text(angle =45, hjust =1) )
Code
tmp |>mutate(CI =str_c("[", scales::percent(CI_low, accuracy =1), " - ", scales::percent(CI_high, accuracy =1), "]" ),value =str_c( scales::percent(prop_colonized, accuracy =1), " (", n_colonized, "/", n_exposed,")<br>", CI ),strain_info =str_c(LBP, "<br>(", strain_origin, " strains)") ) |>select(-n_exposed, -n_colonized, -prop_colonized, -CI_low, -CI_high, -CI, -LBP, -strain_origin) |>pivot_wider(names_from =c(site), values_from = value ) |>arrange(strain_info) |>group_by(strain_info) |>gt(row_group_as_column =TRUE, process_md =TRUE,caption ="Proportion of mITT participants exposed to a given strain who colonized with that strain (colonization defined as a ≥ 5% relative abundance by metagenomics) at any weekly visit after product exposure." ) |>cols_label(.feature ="LBP strain") |>cols_align(columns = .feature, align ="left") |>cols_align(columns =c("CAP", "MGH"), align ="center") |>fmt_markdown(columns =everything())
Proportion of mITT participants exposed to a given strain who colonized with that strain (colonization defined as a ≥ 5% relative abundance by metagenomics) at any weekly visit after product exposure.
LBP strain
CAP
MGH
LC-106 & LC-115 (SA strains)
FF00018
7% (4/56) [3% - 17%]
20% (3/15) [7% - 45%]
FF00051
30% (17/56) [20% - 43%]
67% (10/15) [42% - 85%]
UC101
25% (14/56) [16% - 38%]
73% (11/15) [48% - 89%]
LC-106 & LC-115 (US strains)
C0022A1
25% (14/56) [16% - 38%]
47% (7/15) [25% - 70%]
C0059E1
61% (34/56) [48% - 72%]
87% (13/15) [62% - 96%]
C0175A1
39% (22/56) [28% - 52%]
67% (10/15) [42% - 85%]
LC-115 (SA strains)
122010
57% (8/14) [33% - 79%]
33% (2/6) [10% - 70%]
185329
0% (0/14) [0% - 22%]
0% (0/6) [0% - 39%]
FF00004
50% (7/14) [27% - 73%]
83% (5/6) [44% - 97%]
FF00064
50% (7/14) [27% - 73%]
83% (5/6) [44% - 97%]
FF00072
7% (1/14) [1% - 31%]
50% (3/6) [19% - 81%]
UC119
0% (0/14) [0% - 22%]
0% (0/6) [0% - 39%]
LC-115 (US strains)
C0006A1
50% (7/14) [27% - 73%]
67% (4/6) [30% - 90%]
C0028A1
0% (0/14) [0% - 22%]
0% (0/6) [0% - 39%]
C0112A1
64% (9/14) [39% - 84%]
83% (5/6) [44% - 97%]
Excluding immediate post-product visit
Code
tmp <-# we use the metagenomics data for this mae[["mg"]] |>as_tibble() |>filter(!is.na(LBP), !poor_quality_mg_data) |>select(.feature, LBP, strain_origin, .sample, uid, rel_ab) |># add the arm, mITT, etc.left_join( mae |>colData() |>as_tibble() |>select(uid, pid, site, randomized, mITT, arm, visit_code, visit, PIPV), by =join_by(uid) ) |># only consider the mITTfilter(randomized, mITT) |>mutate(include_visit =case_when( (arm %in%c("LC-106-7", "LC-115")) &as.numeric(visit_code) >1200~TRUE, (arm %in%c("LC-106-3", "LC-106-o")) &as.numeric(visit_code) >=1200~TRUE,TRUE~FALSE ) ) |>filter(PIPV, include_visit) |># only include visits after the first 2 weeks# define the outcome of interest for each person and straingroup_by(.feature, LBP, strain_origin, pid, arm, site) |>summarize(ever_colonized =any(rel_ab >=0.05, na.rm =TRUE),.groups ="drop" ) |># exclude Placebo participants because, they were, in theory, not exposed to any of the strains.filter(arm !="Pl") |># define the total number of participants exposed for each strain.# For LC-115 strains, this is only participants of arm LC-115# For LC-106 strains, this is participants from all active arms.mutate(exposed =case_when( LBP =="LC-115"~ (arm =="LC-115"), LBP =="LC-106 & LC-115"~TRUE,TRUE~FALSE ) ) |># we compute the statistics of interestgroup_by(.feature, LBP, strain_origin, site) |>summarize(n_exposed =sum(exposed, na.rm =TRUE),n_colonized =sum(ever_colonized & exposed, na.rm =TRUE),.groups ="drop" ) |>mutate(prop_colonized = n_colonized / n_exposed,CI_low = binom::binom.confint(n_colonized, n_exposed, method ="wilson")$lower,CI_high = binom::binom.confint(n_colonized, n_exposed, method ="wilson")$upper )
Code
tmp |>ggplot() +aes(x = .feature, y = prop_colonized, col = site) +facet_grid(. ~ LBP + strain_origin, scales ="free", space ="free") +geom_errorbar(aes(ymin = CI_low, ymax = CI_high), linewidth =1, alpha =0.7,width =0.3, position =position_dodge(width =0.5) ) +geom_point(position =position_dodge(width =0.5), size =2) +scale_color_manual("Study site", values = site_colors) +xlab("LBP strain") +ylab("Proportion of participants colonized\namong those exposed") +theme(axis.text.x =element_text(angle =45, hjust =1) )
Code
tmp |>mutate(CI =str_c("[", scales::percent(CI_low, accuracy =1), " - ", scales::percent(CI_high, accuracy =1), "]" ),value =str_c( scales::percent(prop_colonized, accuracy =1), " (", n_colonized, "/", n_exposed,")<br>", CI ),strain_info =str_c(LBP, "<br>(", strain_origin, " strains)") ) |>select(-n_exposed, -n_colonized, -prop_colonized, -CI_low, -CI_high, -CI, -LBP, -strain_origin) |>pivot_wider(names_from =c(site), values_from = value ) |>arrange(strain_info) |>group_by(strain_info) |>gt(row_group_as_column =TRUE, process_md =TRUE,caption ="Proportion of mITT participants exposed to a given strain who colonized with that strain (colonization defined as a ≥ 5% relative abundance by metagenomics) at any weekly visit after product exposure but excluding immediate post-product visit (i.e., excluding week 2 visit for the LC-106-7 and LC-115 arms)." ) |>cols_label(.feature ="LBP strain") |>cols_align(columns = .feature, align ="left") |>cols_align(columns =c("CAP", "MGH"), align ="center") |>fmt_markdown(columns =everything())
Proportion of mITT participants exposed to a given strain who colonized with that strain (colonization defined as a ≥ 5% relative abundance by metagenomics) at any weekly visit after product exposure but excluding immediate post-product visit (i.e., excluding week 2 visit for the LC-106-7 and LC-115 arms).
LBP strain
CAP
MGH
LC-106 & LC-115 (SA strains)
FF00018
2% (1/56) [0% - 9%]
7% (1/15) [1% - 30%]
FF00051
9% (5/56) [4% - 19%]
20% (3/15) [7% - 45%]
UC101
2% (1/56) [0% - 9%]
7% (1/15) [1% - 30%]
LC-106 & LC-115 (US strains)
C0022A1
21% (12/56) [13% - 34%]
33% (5/15) [15% - 58%]
C0059E1
43% (24/56) [31% - 56%]
40% (6/15) [20% - 64%]
C0175A1
39% (22/56) [28% - 52%]
67% (10/15) [42% - 85%]
LC-115 (SA strains)
122010
36% (5/14) [16% - 61%]
17% (1/6) [3% - 56%]
185329
0% (0/14) [0% - 22%]
0% (0/6) [0% - 39%]
FF00004
0% (0/14) [0% - 22%]
0% (0/6) [0% - 39%]
FF00064
0% (0/14) [0% - 22%]
0% (0/6) [0% - 39%]
FF00072
0% (0/14) [0% - 22%]
0% (0/6) [0% - 39%]
UC119
0% (0/14) [0% - 22%]
0% (0/6) [0% - 39%]
LC-115 (US strains)
C0006A1
0% (0/14) [0% - 22%]
0% (0/6) [0% - 39%]
C0028A1
0% (0/14) [0% - 22%]
0% (0/6) [0% - 39%]
C0112A1
0% (0/14) [0% - 22%]
0% (0/6) [0% - 39%]
Supplementary analyses
Long terme colonization
“Among people with detection of the LBP strains during the first 5 weeks, approximately 40-60% still had detection at week 12 (Figure 3b), demonstrating prolonged durability despite a relatively short initial dosing.”
Among participants colonized by week 5, how many are still colonized at week 12?
We load data on stained applicators from CRF46 and data on returned applicators from CRF23. We also load the exposures table to obtain information on study product use. Then, we create the applicator table, which includes only participants from the mITT population.
Some participants returned applicators at two different visits. According to the CAPRISA Vulindlela Pharmacy, used and remaining applicators, along with the study product, were given back to participants after the first visit, and they were expected to return all applicators at the second visit. For this reason, when applicators were returned at both visits, we kept only the data from the second visit.
Two participants returned eight applicators. In both cases, comments explain that they had received a spare applicator:
Participant 068-20-0425 reported taking the spare applicator out of its wrapping and placing it in the return bag without using or washing any applicators.
Participant 068-20-0439 reported inserting the spare applicator on the day of the visit, thinking all applicators had to be used, but also did not wash any of them.
Code
applicator |>ggplot(aes(x = visit_code, y = pid, fill = number_applicators |>as.factor())) +geom_tile() +geom_text(aes(label = applicator_stain_positive), color ="black", size =2) +scale_fill_manual( name ="Number of applicators \nreturned",values =c("3"="lightblue3", "4"="lightblue", "5"="lightblue1", "6"="lightgreen", "7"="green3", "8"="darkgreen" ), na.translate =FALSE) +labs(x ="Visit code",y ="Participant ID",title ="Number of applicator returned (out of number applicator stained positive)") +theme(axis.text.y =element_text(size=4))
Time since last dose in 7-days active arms (LC-106-7, LC-115)
We compute this time based on self-reported time of last study product use (daily - CRF plate 32/33) -> date/time of last study product = use the crf 32/33 visit_day + answer to when? (early morning, etc.) -> swab time = use study_day of the weekly visit.
To compute the time between the last dose taken prior to the in-person visit at week 2, we use information from three sources: crf32 for the time of insertion of the study product, the exposure table for the date of the last dose, and the visits_crfs_merged table for the study day of the week 2 in-person visit.
To better estimate this time interval, we combine the study_day (which reflects the calendar day of product use) with insertion_study_product_time to account for the time of day the product was inserted. Specifically, we convert the reported insertion window into a fractional day value and add it to the study day, as follows:
Before 6am: 0.125
Early morning (6–9am): 0.25
Late morning (9–12pm): 0.4375
Early afternoon (12–3pm): 0.5625
Late afternoon (3–6pm): 0.6875
Evening (6–9pm): 0.8125
Late evening (9pm–midnight): 0.9375
For the week 2 in-person visit, we also assume it occurred during the day (rather than at midnight), so we add 0.5 to the visit’s study_day to reflect a likely early afternoon visit time.
study_prod_time |>ggplot() +aes(x = time_between_final_dose_and_v2*24, y = pid) +geom_point() +geom_vline(xintercept =c(24, 48, 72, 96), linetype ="dashed", color ="steelblue") +scale_x_continuous(breaks =seq(0, 24*4, by =12), name ="Time between final dose and visit 2 (hours)" ) +facet_grid(site~arm, scales ="free_y", space ="free_y")
LBP colonization in participants with BV post-MTZ
see “After oral metronidazole treatment, 22/70 (30%) people still had a Nugent score of 7 or higher, indicating a failure to resolve BV. Among the 16 of these people exposed to LBP, ZZ had a detectable LBP strain identified after treatment by metagenomics, and TT by qPCR during treatment.”
The Nugent score is available in CRF14. We are interested in the Nugent score value at the visit following metronidazole treatment (visit 1100).
We first create a table containing the Nugent score for all participants at all visits. We then create a variable nugent_positive, which is TRUE if nugent_total_score >= 7 and FALSE otherwise. We also create a variable active_arm to group participant by active arm (i.e. LC106-o, LC-106-7, LC-106-3, LC-115) vs Placebo
At visit 1100, among participants with BV who were exposed to the LBP (i.e., in the active arm), 5 participants had a detectable LBP strain identified after treatment by metagenomics.
Code
# among the 12 of these people exposed to LBPnugent_score |>filter(visit_code =="1100") |>filter(active_arm) |>filter(nugent_positive) |>left_join( mae[["mg"]] |>as_tibble() |># add mg data dplyr::left_join( mae@colData |>as_tibble() |>select(uid, pid, visit_code),by =join_by(.sample == uid) ) |>filter(visit_code =="1200"),by =c("pid") ) |>filter(!is.na(LBP)) |>#only LBP straingroup_by(pid) |>summarise(prop_tot_LBP =sum(rel_ab)) |>arrange(prop_tot_LBP |>desc())
#nugent score of active arm at visit 1100 nugent_score |>filter(visit_code =="1100") |>group_by(arm) |>summarise(n =n(),n_positive =sum(nugent_positive, na.rm =TRUE),p = n_positive / n,CI = binom::binom.confint(n_positive, n, method ="exact") ) |>mutate(pct =round(p *100, 1),CI_text =str_c("(", round(CI$lower *100, 1), "%–", round(CI$upper *100, 1), "%)") ) |>select(-CI, -p) |>gt() |>tab_header(title ="Proportion of Participants with Nugent Score ≥ 7 at Visit 1100 (week1)" ) |>cols_label(n ="N participants",n_positive ="n (%) participants with nugent score >= 7",pct ="Percentage",CI_text ="95% CI" )
Proportion of Participants with Nugent Score ≥ 7 at Visit 1100 (week1)
arm
N participants
n (%) participants with nugent score >= 7
Percentage
95% CI
Pl
19
6
31.6
(12.6%–56.6%)
LC-106-7
21
3
14.3
(3%–36.3%)
LC-106-3
15
6
40.0
(16.3%–67.7%)
LC-106-o
15
1
6.7
(0.2%–31.9%)
LC-115
20
2
10.0
(1.2%–31.7%)
See “If we combine all active arms vs. placebo, at 5 weeks recurrence was 33% vs. 67% (p = 0.05) and at 12 weeks 44% vs. 70% (p = 0.13). Participants who achieved the primary outcome of LBP detection were less likely to have BV recurrence during the 12 week follow up: YY/YY vs. YY/YY who did not colonize (p = YY).”
Proportion of Participants with Nugent Score ≥ 7 at Visit 2120 (week12)
active_arm
N participants
n (%) participants with nugent score >= 7
Percentage
95% CI
FALSE
17
12
70.6
(44%–89.7%)
TRUE
69
23
33.3
(22.4%–45.7%)
Participants who achieved the primary outcome of LBP detection were less likely to have BV recurrence during the 12 week follow up: YY/YY vs. YY/YY who did not colonize (p = YY).
See “However, an ideal product would durably colonize the majority of people, and given that LBP strains were detectable in only 34% of active arm participants at 12 weeks this product still needs optimization”